Caching Patterns in System Design

πŸ“š Core Concepts​

Cache Hit vs Cache Miss​

| Term | Definition | Speed | What Happens |
|------|------------|-------|--------------|
| Cache Hit ✅ | Requested data found in cache | Fast | Data retrieved directly from cache |
| Cache Miss ❌ | Requested data not in cache | Slow | Must fetch from main storage, then optionally cache it |

Analogy​

Think of the cache as your desk drawer and main storage as a library shelf:

  • Cache hit: Find your notebook in the desk drawer β†’ instant access ⚑
  • Cache miss: Not in drawer β†’ walk to library shelf β†’ slower retrieval 🐌

πŸ”„ Caching Patterns​

1. Cache-Aside (Lazy Loading)​

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Application β”‚
β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
Check Cache?
β”œβ”€ Hit β†’ Return
└─ Miss β†’ Fetch DB β†’ Store in Cache β†’ Return

Characteristics:

  • Application manages cache explicitly
  • Cache populated on-demand
  • Most common pattern

Pros:

  • βœ… Only caches what's actually needed
  • βœ… Cache failure doesn't break the system
  • βœ… Flexible - app has full control

Cons:

  • ❌ First request always slow (cache miss)
  • ❌ Extra code in application layer
  • ❌ Potential for stale data

When to Use:

  • Read-heavy workloads
  • When you want fine-grained control
  • General-purpose caching (Redis, Memcached)

Real-World Example: E-commerce product catalog caching
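The flow above can be sketched in a few lines of Python. This is a minimal in-process illustration: the two dicts stand in for the database and for Redis/Memcached, and names like `get_product` are made up for the example.

```python
# Stand-ins for the real stores: `db` plays the database, `cache` plays Redis.
db = {"product:1": {"name": "Laptop", "price": 999}}
cache = {}

def get_product(key):
    """Cache-aside read: try the cache, fall back to the DB on a miss."""
    if key in cache:
        return cache[key]              # cache hit: fast path
    value = db.get(key)                # cache miss: go to the primary store
    if value is not None:
        cache[key] = value             # populate on demand (lazy loading)
    return value

def update_product(key, value):
    """Cache-aside write: update the DB, then invalidate the cached copy."""
    db[key] = value
    cache.pop(key, None)               # next read repopulates the cache
```

Invalidating (rather than rewriting) the cache entry on writes keeps the write path simple, at the cost of one extra miss on the next read.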


2. Read-Through​

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Application β”‚ ──Read──> β”Œβ”€β”€β”€β”€β”€β”€β”€β”
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ Cache β”‚ ──Auto Fetch──> Database
β””β”€β”€β”€β”€β”€β”€β”€β”˜

Characteristics:

  • Cache layer handles data loading
  • Transparent to application
  • Cache acts as abstraction over DB

Pros:

  • βœ… Simpler application code
  • βœ… Centralized cache logic
  • βœ… Consistent read interface

Cons:

  • ❌ First request still slow
  • ❌ Adds complexity to cache layer
  • ❌ Tighter coupling with cache system

When to Use:

  • When you want cache to own data loading
  • Frameworks that support it (Ehcache, Caffeine)
  • Microservices with dedicated cache service

Difference from Cache-Aside: the cache layer performs the DB fetch itself, rather than the application.
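A minimal sketch of that inversion: the cache is constructed with a loader function and fetches from the backing store on its own. The class and names here are illustrative, not a real library API.

```python
class ReadThroughCache:
    """The cache owns the loader; callers never talk to the DB directly."""

    def __init__(self, loader):
        self._loader = loader          # e.g. a DB query function
        self._store = {}

    def get(self, key):
        if key not in self._store:
            # Transparent to the caller: the cache fetches on a miss.
            self._store[key] = self._loader(key)
        return self._store[key]

db = {"user:1": "alice"}
users = ReadThroughCache(loader=db.get)
```

The application only ever calls `users.get(...)`; where the data came from is the cache's concern.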


3. Write-Through​

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Application β”‚ ──Write──┬──> Cache (sync)
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ └──> Database (sync)

Characteristics:

  • Every write goes to both cache and DB
  • Synchronous double-write
  • Strong consistency guaranteed

Pros:

  • βœ… Cache always fresh and consistent
  • βœ… No risk of stale reads
  • βœ… Simple consistency model

Cons:

  • ❌ Higher write latency (two operations)
  • ❌ Wasted writes for rarely-read data
  • ❌ Cache can fill with cold data

When to Use:

  • Strong consistency requirements
  • Read-after-write scenarios common
  • Financial systems, user profiles

Real-World Example: User session data, account balances
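The synchronous double-write is easy to see in code. A minimal sketch, again with dicts standing in for the real stores:

```python
class WriteThroughCache:
    """Every write lands in both the DB and the cache before returning."""

    def __init__(self, db):
        self._db = db
        self._store = {}

    def put(self, key, value):
        self._db[key] = value          # synchronous DB write first...
        self._store[key] = value       # ...then the cache, keeping them in step

    def get(self, key):
        return self._store.get(key, self._db.get(key))

db = {}
balances = WriteThroughCache(db)
```

Writing the DB first means a cache failure after the DB write leaves you with a stale-but-recoverable cache rather than a cached value the DB never saw.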


4. Write-Behind / Write-Back​

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Application β”‚ ──Write──> Cache (fast return) ~~async batch~~> Database
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Characteristics:

  • Writes happen to cache only
  • DB updated later in batches
  • Eventually consistent

Pros:

  • βœ… Extremely fast writes
  • βœ… Can batch/coalesce multiple writes
  • βœ… Reduces DB load significantly

Cons:

  • ❌ Risk of data loss if cache fails
  • ❌ Complexity in failure handling
  • ❌ Eventual consistency only

When to Use:

  • High write throughput needed
  • Acceptable to lose recent writes on failure
  • Analytics pipelines, logging systems

Real-World Example: Page view counters, metrics aggregation
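The batching and coalescing behavior can be sketched as follows. This toy version flushes when the dirty set reaches a size threshold; a real implementation would flush from a background thread or timer and handle flush failures.

```python
class WriteBehindCache:
    """Writes hit the cache only; the DB is updated later in batches."""

    def __init__(self, db, batch_size=3):
        self._db = db
        self._store = {}
        self._dirty = {}               # pending writes, coalesced per key
        self._batch_size = batch_size

    def put(self, key, value):
        self._store[key] = value       # fast return: cache-only write
        self._dirty[key] = value       # repeated writes to a key coalesce
        if len(self._dirty) >= self._batch_size:
            self.flush()

    def flush(self):
        self._db.update(self._dirty)   # one batched DB write
        self._dirty.clear()

db = {}
counters = WriteBehindCache(db, batch_size=2)
```

Note the data-loss risk from the Cons list is visible here: anything still in `_dirty` when the process dies never reaches the DB.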


5. Write-Around​

Write: Application ────────────────────> Database (bypass cache)
Read:  Application ──> Cache? ──Miss──> Database ──> Cache

Characteristics:

  • Writes skip the cache entirely
  • Cache populated only on reads
  • Prevents cache pollution

Pros:

  • βœ… Avoids caching rarely-read data
  • βœ… Keeps cache focused on hot data
  • βœ… Better cache hit ratio for actual reads

Cons:

  • ❌ First read after write always misses
  • ❌ Higher latency for read-after-write
  • ❌ Not ideal for write-then-read patterns

When to Use:

  • High write volume, low read volume
  • Write-once-read-never scenarios
  • Log ingestion, data warehousing

Real-World Example: Event logging, audit trails
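Both paths fit in a few lines. A minimal sketch with dicts as the two stores:

```python
db, cache = {}, {}

def write(key, value):
    db[key] = value                    # writes bypass the cache entirely

def read(key):
    if key in cache:
        return cache[key]
    value = db.get(key)                # first read after a write misses...
    if value is not None:
        cache[key] = value             # ...and only then is the entry cached
    return value
```

The read path is identical to cache-aside; the pattern is defined by what the write path does *not* do.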


6. Refresh-Ahead (Proactive Caching)​

Cache monitors TTL ──> Preemptively refreshes BEFORE expiration

Characteristics:

  • Predictive cache warming
  • Reduces cache misses for hot data
  • Requires usage pattern prediction

Pros:

  • βœ… Minimizes cache misses
  • βœ… Consistent low latency
  • βœ… Great for predictable access patterns

Cons:

  • ❌ Wastes resources on cold data
  • ❌ Complex implementation
  • ❌ Needs good prediction algorithm

When to Use:

  • Frequently accessed data with predictable patterns
  • Low-latency requirements (gaming, trading)
  • Content delivery networks (CDN)

Real-World Example: Homepage content, trending articles
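One common way to implement this is to reload an entry once it has lived past some fraction of its TTL. The sketch below is synchronous so the logic stays visible; real implementations refresh in the background so the caller still gets the old value instantly. `refresh_ratio` and the injectable `now` parameter are choices made for this example.

```python
import time

class RefreshAheadCache:
    """Reload an entry once it passes `refresh_ratio` of its TTL, so hot
    keys are refreshed before they ever expire."""

    def __init__(self, loader, ttl=60.0, refresh_ratio=0.8):
        self._loader = loader
        self._ttl = ttl
        self._refresh_ratio = refresh_ratio
        self._store = {}               # key -> (value, loaded_at)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._store.get(key)
        if entry is None or now - entry[1] >= self._ttl:
            value = self._loader(key)  # hard miss or fully expired
        elif now - entry[1] >= self._ttl * self._refresh_ratio:
            value = self._loader(key)  # still valid: refresh ahead of expiry
        else:
            return entry[0]            # fresh: serve straight from cache
        self._store[key] = (value, now)
        return value

loads = []
def loader(key):
    loads.append(key)                  # simulate the expensive render/query
    return f"render #{len(loads)}"

homepage = RefreshAheadCache(loader, ttl=10.0, refresh_ratio=0.8)
```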


7. TTL (Time-to-Live) Based​

Cache Entry [Created] ──(time passes)──> [TTL Expires] ──> Auto-removed

Characteristics:

  • Time-based expiration
  • Simplest invalidation strategy
  • Combined with other patterns

Pros:

  • βœ… Simple to implement
  • βœ… Prevents indefinitely stale data
  • βœ… Works with any caching pattern

Cons:

  • ❌ Can cause cache miss storms at expiration
  • ❌ Arbitrary time selection
  • ❌ May evict still-valid data

When to Use:

  • Data with known freshness requirements
  • Combined with most caching strategies
  • Session tokens, temporary data

Real-World Example: API rate limiting, JWT tokens
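A minimal sketch of TTL expiry, dropping entries lazily on read (the strategy Redis combines with active background expiry). The injectable `now` parameter is there purely to make the example testable.

```python
import time

class TTLCache:
    """Entries carry an absolute expiry time and are dropped lazily on read."""

    def __init__(self, ttl=30.0):
        self._ttl = ttl
        self._store = {}               # key -> (value, expires_at)

    def put(self, key, value, now=None):
        now = time.monotonic() if now is None else now
        self._store[key] = (value, now + self._ttl)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._store.get(key)
        if entry is None or now >= entry[1]:
            self._store.pop(key, None) # expired: treat as a miss
            return None
        return entry[0]

sessions = TTLCache(ttl=30.0)
```

In practice, add jitter to the TTL per key so a burst of entries written together doesn't expire together (see cache miss storms below).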


πŸ—‘οΈ Eviction Policies​

LRU (Least Recently Used)​

  • Strategy: Evicts items not accessed recently
  • Best for: Temporal locality (recently used = likely to be used again)
  • Example: Web page caching

LFU (Least Frequently Used)​

  • Strategy: Evicts items accessed least often
  • Best for: Popular content, frequency-based access
  • Example: Video streaming platforms

FIFO (First In First Out)​

  • Strategy: Evicts oldest entries
  • Best for: Simple queue-like behavior
  • Example: Basic message queues

Random Replacement​

  • Strategy: Evicts random entries
  • Best for: When no clear pattern exists, lowest overhead
  • Example: Simple distributed caches
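LRU, the most common of these policies, fits neatly on top of Python's `OrderedDict`, which keeps keys in insertion order and lets us reuse that order as recency order:

```python
from collections import OrderedDict

class LRUCache:
    """Bounded cache that evicts the least recently used entry when full."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._store = OrderedDict()    # insertion order doubles as recency order

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)   # touched: now most recently used
        return self._store[key]

    def put(self, key, value):
        self._store[key] = value
        self._store.move_to_end(key)
        if len(self._store) > self._capacity:
            self._store.popitem(last=False)  # evict the LRU entry
```

Both operations are O(1); production caches (Caffeine, Redis) use approximations of LRU/LFU to avoid even this bookkeeping under contention.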

πŸ“Š Pattern Comparison Matrix​

| Pattern | Write Speed | Read Speed | Consistency | Complexity | Data Loss Risk |
|---------|-------------|------------|-------------|------------|----------------|
| Cache-Aside | 🟡 Medium | 🟢 Fast* | 🟡 Eventual | 🟢 Low | 🟢 Low |
| Read-Through | 🟡 Medium | 🟢 Fast* | 🟡 Eventual | 🟡 Medium | 🟢 Low |
| Write-Through | 🔴 Slow | 🟢 Very Fast | 🟢 Strong | 🟢 Low | 🟢 None |
| Write-Back | 🟢 Very Fast | 🟢 Very Fast | 🟡 Eventual | 🔴 High | 🔴 High |
| Write-Around | 🟢 Fast | 🟡 Medium | 🟡 Eventual | 🟢 Low | 🟢 None |
| Refresh-Ahead | 🟡 Medium | 🟢 Very Fast | 🟡 Eventual | 🔴 High | 🟢 Low |

*After initial cache miss


🎯 Common Pattern Combinations​

High-Traffic Web Application​

Read Strategy: Cache-Aside + LRU eviction
Write Strategy: Write-Through for critical data
TTL: 5-15 minutes for most content
Tools: Redis, Memcached

Analytics Pipeline​

Read Strategy: Read-Through
Write Strategy: Write-Back (batch inserts)
Eviction: LFU (frequently queried reports)
Tools: Apache Ignite, Hazelcast

E-commerce Product Catalog​

Read Strategy: Cache-Aside + Refresh-Ahead for bestsellers
Write Strategy: Write-Around for inventory updates
TTL: 1 hour for product details
Tools: Redis with pub/sub for invalidation

Social Media Feed​

Read Strategy: Cache-Aside + Refresh-Ahead for active users
Write Strategy: Write-Back for likes/views
TTL: 30 seconds for feed items
Eviction: LRU
Tools: Redis Cluster

🌳 Decision Tree​

β”Œβ”€ Need strong consistency?
β”‚ β”œβ”€ YES β†’ Write-Through
β”‚ └─ NO ↓
β”‚
β”œβ”€ High write volume?
β”‚ β”œβ”€ YES ↓
β”‚ β”‚ β”œβ”€ Can tolerate data loss?
β”‚ β”‚ β”‚ β”œβ”€ YES β†’ Write-Back
β”‚ β”‚ β”‚ └─ NO β†’ Write-Around
β”‚ └─ NO β†’ Cache-Aside
β”‚
β”œβ”€ Need ultra-low read latency?
β”‚ └─ Add Refresh-Ahead
β”‚
└─ Cache filling up?
└─ Choose eviction:
β”œβ”€ Temporal patterns β†’ LRU
└─ Popularity-based β†’ LFU

βœ… Best Practices​

  1. Start with Cache-Aside

    • Most flexible and widely understood
    • Easy to debug and reason about
  2. Always Set TTL

    • Even with other invalidation strategies
    • Prevents unbounded cache growth
  3. Monitor Cache Hit Ratio

    • Aim for a hit ratio above 80%
    • Alert on sudden drops
  4. Handle Cache Failures Gracefully

    • App should work even if cache is down
    • Implement circuit breakers
  5. Use Appropriate Serialization

    • Consider Protobuf/MessagePack over JSON
    • Faster and more compact
  6. Warm Critical Caches on Startup

    • Don't wait for cold starts
    • Pre-populate frequently accessed data
  7. Implement Cache Stampede Protection

    • Use locks/semaphores for cache misses
    • Prevent thundering herd
  8. Size Your Cache Appropriately

    • Monitor eviction rates
    • Balance memory cost vs hit rate

⚑ Performance Tips​

  • Batch operations when possible (especially with Write-Back)
  • Use pipeline/multi-get for multiple keys (Redis MGET, MSET)
  • Consider cache-aside for writes even with read-through for reads
  • Implement circuit breakers for cache failures
  • Use connection pooling for cache clients
  • Monitor P99 latencies, not just averages
  • Compress large values before caching
  • Use appropriate data structures (Redis Hashes, Sets, Sorted Sets)

⚠️ Common Pitfalls​

❌ Cache Stampede​

Problem: Multiple requests reload same expired data simultaneously

Solution:

  • Locking mechanisms (distributed locks)
  • Early recomputation (refresh before expiry)
  • Probabilistic early expiration
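The locking approach can be sketched as below, using a single in-process lock with a double-check so only one caller recomputes an entry while the rest wait. Across multiple servers you would use a per-key distributed lock instead (e.g. Redis `SET` with `NX` and an expiry); the counter here just makes the "computed once" property observable.

```python
import threading

db = {"page:home": "<html>rendered page</html>"}
cache = {}
calls = {"n": 0}                       # counts expensive recomputations
_lock = threading.Lock()

def load_from_db(key):
    calls["n"] += 1                    # simulate the expensive query/render
    return db.get(key)

def get_protected(key):
    if key in cache:
        return cache[key]              # fast path: no lock taken on a hit
    with _lock:
        if key in cache:               # double-check: another thread filled it
            return cache[key]
        value = load_from_db(key)      # only one caller recomputes
        cache[key] = value
        return value
```

A single global lock serializes *all* misses; production code uses one lock per key (or a striped lock array) so unrelated misses don't queue behind each other.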

❌ Stale Data​

Problem: Cache inconsistent with database

Solution:

  • Proper TTL settings
  • Invalidation on writes
  • Event-driven cache updates

❌ Cache Pollution​

Problem: Rarely-used data fills cache

Solution:

  • Write-Around pattern
  • Better eviction policies (LRU/LFU)
  • Cache only frequently accessed data

❌ Over-caching​

Problem: Caching everything indiscriminately

Solution:

  • Profile and measure what to cache
  • Cache only expensive queries
  • Monitor cache hit rates per key pattern

❌ No Monitoring​

Problem: Not knowing hit rates, evictions, or issues

Solution:

  • Implement comprehensive metrics
  • Dashboard for cache health
  • Alerts for anomalies

❌ Ignoring Cache Warm-up​

Problem: Cold start causes poor initial performance

Solution:

  • Pre-populate cache on deployment
  • Gradual traffic ramping
  • Keep cache instances alive during deployments

πŸ“ˆ Key Metrics to Monitor​

| Metric | What It Measures | Target |
|--------|------------------|--------|
| Hit Rate | % of requests served from cache | >80% |
| Miss Rate | % of requests requiring DB fetch | <20% |
| Eviction Rate | How often data is removed | Low & stable |
| Memory Usage | Cache memory consumption | <80% capacity |
| Latency (P50, P99) | Response time distribution | <10ms P99 |
| Throughput | Operations per second | Application dependent |
| Connection Pool | Active connections | Stable |
| Error Rate | Failed cache operations | <0.1% |

In-Memory Caches​

  • Redis - Feature-rich, supports data structures, persistence
  • Memcached - Simple, fast, lightweight
  • Hazelcast - Distributed, Java-based, compute capabilities

Application-Level Caches​

  • Caffeine - High-performance Java cache library
  • Ehcache - Java cache with disk persistence
  • Guava Cache - Simple in-process cache for Java

CDN/Edge Caches​

  • CloudFlare - Global CDN with edge caching
  • AWS CloudFront - Integrated with AWS services
  • Fastly - Real-time CDN with VCL customization

Distributed Caches​

  • Apache Ignite - Distributed database and cache
  • Aerospike - High-performance distributed cache
  • Couchbase - Document DB with built-in caching

πŸ“š Further Reading​


πŸŽ“ Quick Reference Cheat Sheet​

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ WHEN TO USE WHICH PATTERN β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ Read-Heavy + Control β†’ Cache-Aside β”‚
β”‚ Strong Consistency β†’ Write-Through β”‚
β”‚ High Write Throughput β†’ Write-Back β”‚
β”‚ Rarely Read After Write β†’ Write-Around β”‚
β”‚ Predictable Hot Data β†’ Refresh-Ahead β”‚
β”‚ Time-Sensitive Data β†’ TTL-Based β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ EVICTION POLICY SELECTION β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ Recent = Relevant β†’ LRU β”‚
β”‚ Frequency Matters β†’ LFU β”‚
β”‚ Simple Queue β†’ FIFO β”‚
β”‚ No Pattern / Testing β†’ Random β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜